Integrating digital gait data with metabolomics and clinical data to predict outcomes in Parkinson’s disease

Parkinson’s disease (PD) presents diverse symptoms and comorbidities, complicating its diagnosis and management. The primary objective of this cross-sectional, monocentric study was to assess digital gait sensor data’s utility for monitoring and diagnosis of motor and gait impairment in PD. As a secondary objective, for the more challenging tasks of detecting comorbidities, non-motor outcomes, and disease progression subgroups, we evaluated for the first time the integration of digital markers with metabolomics and clinical data. Using shoe-attached digital sensors, we collected gait measurements from 162 patients and 129 controls in a single visit. Machine learning models showed significant diagnostic power, with AUC scores of 83–92% for PD vs. control and up to 75% for motor severity classification. Integrating gait data with metabolomics and clinical data improved predictions for challenging-to-detect comorbidities such as hallucinations. Overall, this approach using digital biomarkers and multimodal data integration can assist in objective disease monitoring, diagnosis, and comorbidity detection.

Mode
The value that appears most frequently in the time series data.

Skewness
A measure of the asymmetry of the probability distribution of the time series data.

Kurtosis
A measure of the "tailedness" of the probability distribution of the time series data.

Interquartile Range
The range between the 25th and 75th percentiles. It measures the statistical dispersion of the data.

Range
The difference between the maximum and minimum values in the time series data.

Correlation Measures

Autocorrelation
A measure of how related a variable is with a lagged version of itself.

Correlations
The Pearson and Spearman correlation coefficients between different axes of the time series data.

Energy and Frequency Features

Zero Crossing Rate
The rate at which the signal changes from positive to negative or back.

Spectral Entropy
A measure of the complexity or randomness of a signal, calculated using the spectral density.

Mean Change
The maximum change in the mean value of the time series over a defined window.

Variance Change
The maximum change in the variance of the time series over a defined window.

Time Lag
The time delay at which the autocorrelation of the signal is significant.
Supplementary Table 2: Tabular overview of the generic features computed to characterize spatial time series measurements, covering 22 features in total. Column 1 groups the features into five categories: statistical moments, correlation measures, energy and frequency features, structural features, and dynamic features; column 2 lists the specific type of each individual feature; column 3 provides a brief description.
For the detailed computation of all features, see section "Code availability".
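As an illustration of the definitions in Supplementary Table 2, the following minimal sketch computes several of the listed features for a single sensor axis using only the Python standard library. It is not the authors' implementation (see section "Code availability" for that). The function names, the population (rather than sample) estimators, the "inclusive" quantile method, and the use of non-overlapping windows for the variance change are all assumptions made for this example.

```python
import statistics


def skewness(x):
    """Third standardized moment (population estimator, assumed here)."""
    m, s = statistics.fmean(x), statistics.pstdev(x)
    return sum(((v - m) / s) ** 3 for v in x) / len(x)


def excess_kurtosis(x):
    """Fourth standardized moment minus 3, so a normal distribution gives 0."""
    m, s = statistics.fmean(x), statistics.pstdev(x)
    return sum(((v - m) / s) ** 4 for v in x) / len(x) - 3.0


def interquartile_range(x):
    """Difference between the 75th and 25th percentiles."""
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    return q3 - q1


def value_range(x):
    """Difference between the maximum and minimum values."""
    return max(x) - min(x)


def autocorrelation(x, lag):
    """Correlation of the series with a copy of itself shifted by `lag` samples."""
    m = statistics.fmean(x)
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[i] - m) * (x[i + lag] - m) for i in range(len(x) - lag))
    return num / denom


def zero_crossing_rate(x):
    """Fraction of consecutive sample pairs whose sign differs (0 counts as non-negative)."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a < 0) != (b < 0))
    return crossings / (len(x) - 1)


def max_variance_change(x, window):
    """Largest absolute change in variance between consecutive windows.

    Assumption: non-overlapping windows; the series must span at least two windows.
    """
    starts = range(0, len(x) - window + 1, window)
    variances = [statistics.pvariance(x[i:i + window]) for i in starts]
    return max(abs(b - a) for a, b in zip(variances, variances[1:]))
```

In practice, each function would be applied separately to every accelerometer and gyroscope axis of each shoe sensor, which gives one feature value per axis and feature type.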
Supplementary Data 1: Full list of metabolites studied, including public database IDs, chemical properties, and associated biochemical pathways (provided as a separate dataset file for further editing and processing by the reader).
Cognitive Assessment (MoCA) score outcomes using three data modalities (gait-specific digital biomarker features, clinical features, and metabolomics features) and extreme gradient boosting for machine learning.
The color coding from purple to yellow represents the feature value range from low to high. The labels on the left correspond to the individual features that were most predictive in terms of the absolute SHAP value, sorted from top to bottom (corresponding absolute SHAP values are shown in bold on the left side of the plot). Feature labels starting with "Left" or "Right" represent digital gait sensor features measured on the left or right shoe, respectively ("gyro" stands for gyrometer; "accel" for accelerometer measurements; TUG stands for the "Timed Up and Go" walking exercise; the remaining parts of the labels reflect the feature types covered in Supplementary Table 2). Feature labels starting with "X." and followed only by a number rather than a metabolite name represent metabolomics features where the corresponding metabolite identity is unknown. Other features correspond to identified metabolites or clinical variables (e.g., the third top-ranked feature is the age of the patient).
Depression Inventory (BDI-I) outcome scores using three data modalities (gait-specific digital biomarker features, clinical features, and metabolomics features) and extreme gradient boosting for machine learning.
The color coding from purple to yellow represents the feature value range from low to high. The labels on the left correspond to the individual features that were most predictive in terms of the absolute SHAP value, sorted from top to bottom (corresponding absolute SHAP values are shown in bold on the left side of the plot). Feature labels starting with "Left" or "Right" represent digital gait sensor features measured on the left or right shoe, respectively ("gyro" stands for gyrometer; "accel" for accelerometer measurements; TUG stands for the "Timed Up and Go" walking exercise; the remaining parts of the labels reflect the feature types covered in Supplementary Table 2). Feature labels starting with "X." and followed only by a number rather than a metabolite name represent metabolomics features where the corresponding metabolite identity is unknown. Other features correspond to identified metabolites or clinical variables (e.g., the top-ranked feature is the score for the "Sniffin' Sticks" smell test that is conducted as part of the clinical examination).
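The feature ordering described in these captions can be sketched as follows. This is an illustration only: it assumes that features are ranked by the mean absolute SHAP value across samples, which is a common convention for SHAP summary plots but is not confirmed by the captions. The function name, feature names, and numeric values are made up for the example and are not taken from the study.

```python
def rank_features_by_abs_shap(shap_values, feature_names):
    """Rank features by mean absolute SHAP value, highest first.

    shap_values: list of rows (one per sample), each holding one SHAP value per feature.
    Returns a list of (feature_name, mean_abs_shap) tuples.
    """
    n_samples = len(shap_values)
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n_samples
        for j in range(len(feature_names))
    ]
    return sorted(zip(feature_names, mean_abs), key=lambda item: item[1], reverse=True)


# Toy input: two samples, three hypothetical features.
toy_shap = [[0.1, -0.5, 0.2], [-0.3, 0.4, 0.0]]
toy_names = ["feature_a", "feature_b", "feature_c"]
ranking = rank_features_by_abs_shap(toy_shap, toy_names)
```

Taking the absolute value before averaging means that a feature that pushes predictions strongly in either direction ranks highly, regardless of the sign of its contribution.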
calculated by dividing the stride length by the stride time.

Std Gait Speed (m/s)
Standard deviation of the average walking speed.

Mean Turning Angle (deg)
The angle between the direction of the last swing phase and the orientation of the foot in the next stance phase.

Std Turning Angle (deg)
Standard deviation of the turning angle between swing phases.

Mean Toe Off Angle (deg)
The angle between the heel and the surface at the beginning of the swing phase.

Std Toe Off Angle (deg)
Standard deviation of the toe off angle at the beginning of swing phases.

Mean Heel Strike Angle (deg)
The angle between the toes and the surface when the foot lands.